Abstract: One of the main theoretical motivations for the 
emerging area of network coding is the achievability of the max- 
flow/min-cut rate for single source multicast. This can exceed the 
rate achievable with routing alone, and is achievable with linear 
network codes. The multi-source problem is more complicated. 
Computation of its capacity region is equivalent to determination 
of the set of all entropy functions $\Gamma^*$, which is non-polyhedral. 
The aim of this paper is to demonstrate that this difficulty can 
arise even in single source problems, in particular for single 
source networks with hierarchical sink requirements and for 
single source networks with secrecy constraints. In both cases, 
we exhibit networks whose capacity regions involve $\Gamma^*$. As in the 
multi-source case, linear codes are insufficient. 



I. Introduction 

Network coding [1], [2] generalizes routing by allowing in- 
termediate nodes to perform coding operations which combine 
received data packets. One of the most celebrated benefits of 
this approach is increased throughput in multicast scenarios. 
This stimulated much of the early research in the area. One 
fundamental problem in network coding is to understand the 
capacity region and the classes of codes that achieve capacity. 
In the single session multicast scenario, the problem is well 
understood. In particular, the capacity region is characterized 
by max-flow/min-cut bounds and linear network codes are 
sufficient to achieve maximal throughput [2], [3]. Network 
coding not only yields a throughput advantage over routing, but 
its capacity can also be easily determined and easily achieved. This is 
in stark contrast to routing, where computation of the capacity 
region and of optimal routes is fundamentally difficult. 
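
As an illustration of the max-flow/min-cut characterization, the following minimal sketch computes the single-session multicast rate as the smallest source-to-sink max-flow. The butterfly topology, the node names and the helper names `max_flow` and `multicast_capacity` are our own illustrative assumptions; they are not taken from [1]-[3].

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp max-flow on a dict-of-dicts capacity map."""
    # Build a residual graph that also holds zero-capacity reverse edges.
    residual = {}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow
        # Find the bottleneck on the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

def multicast_capacity(capacity, source, sinks):
    """Single-session multicast rate: the smallest max-flow to any sink."""
    return min(max_flow(capacity, source, t) for t in sinks)

# Hypothetical butterfly network with unit-capacity links.
butterfly = {
    "s": {"a": 1, "b": 1},
    "a": {"t1": 1, "c": 1},
    "b": {"t2": 1, "c": 1},
    "c": {"d": 1},
    "d": {"t1": 1, "t2": 1},
}
rate = multicast_capacity(butterfly, "s", ["t1", "t2"])
```

For this example both sinks have a max-flow of 2, so the multicast rate is 2. The computation takes polynomial time, which is the sense in which the single-session capacity is easy to determine.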

Significant practical and theoretical complications arise in 
more general multicast scenarios, involving more than one 
session. An expression for the capacity region is known [4]; 
however, it is given by the intersection of a set of hyperplanes 
(specified by the network topology and connection requirement) 
and the set of entropy functions $\Gamma^*$. Unfortunately, 
this capacity region, or even the inner and outer bounds [5]-[7], 
cannot be computed in practice, due to the lack of an 
explicit characterization of the set of entropy functions for 
more than three random variables. This difficulty is not simply 
a consequence of the particular formulation of the capacity 
region given in [4]. It was recently shown that the problem of 
determining the capacity region for the multi-source problem 
is in fact entirely equivalent to the determination of $\bar{\Gamma}^*$, 
the set of almost entropic functions [8]. Furthermore, the 
non-polyhedral nature of $\bar{\Gamma}^*$, revealed in [9], implies a 
non-polyhedral capacity region (in contrast to the max-flow result 
for single sources). To make things even worse, it is also 
known that linear network codes are not sufficient for the 
multi-source problem [3], [8]. 

In this paper, we show that non-polyhedral capacity regions 
can occur even in single source scenarios. We demonstrate 
this phenomenon for single source networks with hierarchical 
sink constraints, and for single source networks with security 
constraints. Our approach is in the spirit of our recent work 
[8], which revealed a deep duality between network codes 
and entropy functions. Direct consequences are non-polyhedral 
capacity regions, the insufficiency of linear network codes and 
the importance of non-Shannon information inequalities. 

Section II provides the basic setup for secure network 
codes, and formally defines achievability and admissibility for 
networks with wiretapping adversaries. Section III focuses on 
the single source incremental multicast scenario, in which the 
sinks have hierarchical requirements. Given a function g, we 
construct an incremental multicast network that is solvable if 
and only if g is entropic. In Section IV we construct a special 
single source secure multicast problem which is equivalent to 
an insecure multi-source multicast problem. Invoking the duality 
results from [8], these constructions relate the solvability 
of both single-source incremental multicast and single source 
secure multicast to multi-source multicast problems. 

II. Background 

The network topology will be modeled by a directed acyclic 
graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$. Vertices $u \in \mathcal{V}$ correspond to communication 
nodes and directed edges $e \in \mathcal{E}$ are error-free point-to-point 
communication links. The connection requirement 
$M = (\mathcal{S}, O, \mathcal{D})$ is specified by three components. The set 
$\mathcal{S}$ indexes the independent multicast sessions, each of which 
is a collection of packets to be multicast to a prescribed set of 
destinations. The session-source location mapping $O : \mathcal{S} \mapsto \mathcal{V}$ 
specifies the originating node $O(s)$ for session $s$. The receiver-location 
mapping $\mathcal{D} : \mathcal{S} \mapsto 2^{\mathcal{V}}$ indicates the set of nodes 
$\mathcal{D}(s) \subseteq \mathcal{V}$ which require the data of session $s$. 

A network code is identified by a set of discrete random 
variables $\{T_{\mathcal{S}}, W_{\mathcal{E}}\}$, defined on finite sample spaces, where for 
concise notation, set-valued subscripts denote a set of objects 
indexed by the set, e.g. $Z_{\mathcal{X}} = \{Z_i, i \in \mathcal{X}\}$. The source 
random variables $T_s$, $s \in \mathcal{S}$ are mutually independent and 
are uniformly distributed on sample spaces whose size will 
be denoted $|T_s|$. The variables $W_e$, $e \in \mathcal{E}$ are the messages 
transmitted over link $e$. 

Since the network is acyclic, variables in $T_{\mathcal{S}}$ and $W_{\mathcal{E}}$ can 
be ancestrally ordered according to the network topology. 
Causal coding requires that edge messages are conditionally 
independent of their non-incident ancestral messages given 
their incident source and message variables. 

Definition 1: A network code is probabilistic if there exists 
an outgoing link message which is not a function of the incom- 
ing source and link messages. Otherwise, it is deterministic. 

Probabilistic network codes can be implemented using 
independent random variables $V_u$ (internal randomness) at 
each node $u \in \mathcal{V}$ such that all outgoing messages from a 
node are deterministic functions of incoming sources and link 
messages and the independent randomness generated at the 
node. It is easy to prove that all probabilistic network codes 
can be implemented in this way. Accordingly, we shall specify 
a probabilistic network code by the set $\{T_{\mathcal{S}}, W_{\mathcal{E}}, V_{\mathcal{V}}\}$. 

Lemma 1: Given random variables $X_1$, $X_2$ and $V$, if $V$ is 
independent of $X_1$ and $X_2$, and $X_2$ is a function of $X_1$ and 
$V$, then $X_2$ is a function of $X_1$ alone. 
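
A short derivation of Lemma 1 is sketched here. It reads "independent of $X_1$ and $X_2$" as independence from the pair $(X_1, X_2)$, which is an assumption on our part about the intended meaning.

```latex
\begin{align*}
H(X_2 \mid X_1)
  &= H(X_2 \mid X_1, V) + I(X_2; V \mid X_1) \\
  &= I(X_2; V \mid X_1)
     && \text{($X_2$ is a function of $X_1$ and $V$)} \\
  &\le I(X_1; V) + I(X_2; V \mid X_1) = I(X_1, X_2; V)
     && \text{(chain rule)} \\
  &= 0
     && \text{($V$ is independent of $(X_1, X_2)$).}
\end{align*}
```

Hence $H(X_2 \mid X_1) = 0$, which is the statement that $X_2$ is a function of $X_1$ alone.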

The implication of the lemma is as follows. At the sinks (or 
any intermediate node) of the network, if reconstruction of the 
source messages is possible, then it can also be achieved in 
the absence of "internal randomness". In fact, in the absence 
of security constraints, it is known that deterministic network 
codes are sufficient [6]. This is not always the case for the 
wiretapping scenarios considered in Section IV. 

In addition to legitimate sinks, there are $|\mathcal{R}|$ adversaries, 
which can eavesdrop on any message transmitted along a given 
collection of links. Each adversary attempts to reconstruct a 
particular set of source messages, according to a wiretapping 
pattern. 

Definition 2 (Wiretapping pattern): The wiretapping pattern 
is specified by a collection of tuples $(\mathcal{A}_r, \mathcal{B}_r)$ for $r \in \mathcal{R}$ 
such that $\mathcal{A}_r \subseteq \mathcal{S}$ is the subset of sources to be reconstructed 
by adversary $r$, which observes only the links in $\mathcal{B}_r$. 

For a given network code designed with respect to a 
connection requirement $M$, define $P_e$ as the error probability 
that at least one receiver fails to correctly reconstruct one or 
more of its requested source messages. A zero-error network 
code is one for which $P_e = 0$, and hence the source messages 
$T_{\mathcal{S}}$ can be perfectly reconstructed at desired sinks. The goal 
of secure communications is to transmit information such that 
any eavesdropper listening to the traffic on all the links in 
$\mathcal{B}_r$ remains "ignorant" of the data transmitted by the sources 
in $\mathcal{A}_r$. A perfectly secure network code is one for which the 
information leakage $I(T_{\mathcal{A}_r}; W_{\mathcal{B}_r}) = 0$ for all $r \in \mathcal{R}$. 
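
To make the zero-leakage condition concrete, the sketch below evaluates the mutual information between a source bit and a wiretapped link message in a toy setting. The one-time-pad construction (a uniform source bit T, a uniform key bit K generated as internal randomness, and a link message W = T xor K) and all function names are illustrative assumptions; this is not one of the networks studied in this paper.

```python
from collections import defaultdict
from itertools import product
from math import log2

def marginal(joint, idx):
    """Marginal pmf of the coordinates listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return dict(out)

def entropy(pmf):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def mutual_information(joint, a, b):
    """I(A; B) = H(A) + H(B) - H(A, B) for coordinate tuples a and b."""
    return (entropy(marginal(joint, a)) + entropy(marginal(joint, b))
            - entropy(marginal(joint, a + b)))

# Joint pmf over (T, K, W) with W = T xor K and T, K independent uniform bits.
joint = {(t, k, t ^ k): 0.25 for t, k in product((0, 1), repeat=2)}

leak_padded = mutual_information(joint, (0,), (2,))  # adversary taps W
leak_plain = mutual_information(joint, (0,), (0,))   # adversary taps T itself
```

Here `leak_padded` evaluates to 0 bits, so a wiretapper observing only W learns nothing about T, while `leak_plain` is 1 bit. The example also shows why internal randomness can matter once secrecy is required: without K, any non-constant function of T leaks information.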

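Definition 3 (Admissible rate-capacity tuple): Given a network 
$\mathcal{G} = (\mathcal{V}, \mathcal{E})$ and a connection requirement $M$, a rate-capacity 
tuple $(\lambda, \omega) = (\lambda_{\mathcal{S}}, \omega_{\mathcal{E}})$ is admissible if there exists 
a perfectly secure, zero-error network code $\Phi = \{W_f, f \in \mathcal{S} \cup \mathcal{E}\}$ such that 

$$H(W_e) \le \log|\mathcal{W}_e| \le \omega_e, \quad \forall e \in \mathcal{E},$$
$$H(T_s) = \log|\mathcal{W}_s| \ge \lambda_s, \quad \forall s \in \mathcal{S},$$

where $W_e$ is the message symbol transmitted along link $e$ and 
$T_s$ is the input symbol generated at source $s$. 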

The preceding definitions consider zero-error network codes 
and perfect security. Relaxing these requirements prompts the 
following definition. 

Definition 4 (Achievable): A rate-capacity tuple $(\lambda, \omega)$ is 
achievable if there exists a sequence of network codes 
and normalizing constants $r(n) > 0$ such that 

In the absence of any security constraints, $|\mathcal{R}| = 0$, these 
definitions reduce to the usual ones and the multi-source, 
multi-sink capacity region is given by [4]. Bounds for the 
multi-source multi-sink scenario with wiretappers were given 
in [10]. 



III. Incremental Multicast 

In this section, we study the special case of incremental 
multicast, meaning that the session indexes are totally ordered 
such that a receiver requesting a particular session also re- 
quests all sessions with lower index. We consider the simplest 
incremental multicast scenario, with only two source messages 
and no secrecy constraints (permitting deterministic codes). 
We will show that determining the capacity region, even in 
such a simple scenario, can be no simpler than solving the 
general multicast problem. 

Our approach is inspired by [8]. Let $\mathcal{H}[\mathcal{N}] \subset \mathbb{R}^{2^N}$ with 
coordinates indexed by proper subsets of a ground set $\mathcal{N}$ 
with $N$ elements. Points $h \in \mathcal{H}[\mathcal{N}]$ can be regarded as 
functions, $h : 2^{\mathcal{N}} \mapsto \mathbb{R}$ with $h(\emptyset) = 0$. Given such an 
$h \in \mathcal{H}[\mathcal{N}]$ we will construct a special network $\mathcal{G}'$, an 
incremental connection requirement $M'$ and a rate-capacity 
tuple $T(h)$ that is admissible if and only if $h$ is entropic. 

The network topology, connection requirement and link 
capacities are defined in Figure 1, which for convenience 
is divided into several subnetworks. The single source node 
is an open circle, labelled with the two available sessions 
(this node is repeated for convenience in Figures 1(a), 1(b) 
and 1(c)). The destinations are double circles, labelled with 
their requirements. Intermediate nodes are solid circles. The 
source and sink labels define the mappings $O$ and $\mathcal{D}$. Each 
capacitated edge is labeled with a pair of symbols denoting 
the edge capacity, and the edge message (and corresponding 
random variable). Unlabelled edges are assumed to be uncapacitated, 
or to have a finite but sufficiently large capacity to 
losslessly forward all received messages. 

The first part of the network, shown in Figure 1(a), contains 
the source where there are two independent sessions (i.e., two 
messages $S_0$ and $S_1$) available. The desired source rates associated 
with $S_0$ and $S_1$ are respectively $\sum_{i \in \mathcal{N}} h(i)$ and $h(\mathcal{N})$. 
There are $2N$ specific edge messages that are of particular 
interest. Rather than naming all edge variables $W_e$, $e \in \mathcal{E}$, 
we label these $2N$ particular edge variables $U_j$ and $V_j$ for 
$j = 1, \ldots, N$. Remaining edge variables will be labelled with 
generic symbols $W_i$ indexed by an integer $i$. 

In Figure 1(a) the source node generates from $S_0$ 
and $S_1$ respectively the sets of network coded messages 
$\{U_1, U_2, \ldots, U_N\}$ and $\{V_1, V_2, \ldots, V_N\}$, which are duplicated 
as required and forwarded to the rest of the network. The 
remainder of the network is divided into subnetworks of two 
types, shown in Figures 1(b) and 1(c). 




Fig. 1. The network $\mathcal{G}'$: (a) Source node, (b) Type 1 subnetworks, (c) Type 2 subnetworks. 



With reference to Figure 1(b), there are $2^N - 1$ type 1 subnetworks, 
one for each nonempty $\alpha \in 2^{\mathcal{N}}$. These subnetworks 
introduce an edge of capacity $h(\mathcal{N}) - h(\alpha)$ between the source 
and a sink requiring $S_1$. There is an intermediate node which 
has another $|\alpha|$ incident edges (from Figure 1(a)), carrying 
$V_\alpha = \{V_j, j \in \alpha\}$. The intermediate node then has an edge of 
capacity $h(\alpha)$ to the sink. 

Figure 1(c) shows the structure of type 2 subnetworks, 
which are indexed by a set $\emptyset \neq \alpha \subseteq \mathcal{N}$ and an element $i \in \mathcal{N}$, $i \notin \alpha$. 
Each type 2 subnetwork connects the source to the upper 
receiver. In addition, there are other incident edges carrying 
$\{V_j : j \in \alpha\}$ and $\{U_j : j \in \mathcal{N}\}$. For notational simplicity, we 
have written $h(\alpha \cup \{i\}) = h(\alpha, i)$. 

So far, we have described a network $\mathcal{G}'$, a connection 
requirement $M'$ and have assigned rates to sources and 
capacities to links. Clearly $M'$ depends only on $N$, and not in 
any other way on h. Similarly, the topology of the network 
depends only on N. The choice of h affects only the source 
rates and edge capacities, which are collected into the rate- 
capacity tuple T(h). Also, we can assume without loss of 
generality that T(h) is a linear function of h. 

Definition 5: A function $h \in \mathcal{H}[\mathcal{N}]$ is called entropic if 
there exist discrete random variables $X_1, \ldots, X_N$ such that 
the entropy of $\{X_i : i \in \alpha\}$ is equal to $h(\alpha)$ for all $\emptyset \neq \alpha \subseteq \mathcal{N}$. 
Furthermore, $h$ is called quasi-uniform if any subset of the 
variables is uniform over its support. 
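
Definition 5 can be checked numerically for a given joint distribution. The sketch below computes the entropy function of a small example and tests quasi-uniformity. The example distribution (two uniform bits and their XOR), the 0-based variable indices, and the helper names are our own assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations, product
from math import isclose, log2

def marginal(joint, idx):
    """Marginal pmf of the variables whose positions are listed in idx."""
    out = defaultdict(float)
    for outcome, p in joint.items():
        out[tuple(outcome[i] for i in idx)] += p
    return dict(out)

def entropy(pmf):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def nonempty_subsets(n):
    """All nonempty subsets of {0, ..., n-1}, as sorted tuples."""
    return [a for k in range(1, n + 1) for a in combinations(range(n), k)]

def entropy_function(joint, n):
    """Map each nonempty subset alpha to the joint entropy of X_alpha."""
    return {a: entropy(marginal(joint, a)) for a in nonempty_subsets(n)}

def is_quasi_uniform(joint, n):
    """True if every subset of variables is uniform over its support."""
    for a in nonempty_subsets(n):
        probs = [p for p in marginal(joint, a).values() if p > 0]
        if not all(isclose(p, probs[0]) for p in probs):
            return False
    return True

# X_0, X_1 independent uniform bits and X_2 = X_0 xor X_1.
xor_joint = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}
h = entropy_function(xor_joint, 3)

# A single biased bit is entropic but not quasi-uniform.
biased_joint = {(0,): 0.9, (1,): 0.1}
```

For `xor_joint`, every single variable has entropy 1 bit and every larger subset has entropy 2 bits, and the distribution is quasi-uniform. The biased bit shows that an entropic function need not come from a quasi-uniform distribution.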

Theorem 1: For the network $\mathcal{G}'$ and connection requirement 
$M'$, if a rate-capacity tuple $T(h)$ is admissible, then $h$ 
is quasi-uniform and hence entropic. 

Proof: Suppose that $T(h)$ is admissible. By Definition 3, 
admissibility of $T(h)$ on $\mathcal{G}'$, $M'$ requires the existence of a 
zero-error network code $\Phi$ with source messages $S_{[\alpha]}$, $\emptyset \neq \alpha \subseteq \mathcal{N}$ 
and a subset of its coded messages $U_{\mathcal{N}}$ and $V_{\mathcal{N}}$. Given 
this hypothesis, we will show that $h$ is the entropy function 
of $V_{\mathcal{N}}$, and that $V_{\mathcal{N}}$ is quasi-uniform. 



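First focus on Figure 1(a). Applying min-cut bounds, it is 
straightforward to prove 

$$H(U_i) = h(i), \quad \forall i \in \mathcal{N},$$
$$H(V_{\mathcal{N}}) = h(\mathcal{N}),$$
$$H(V_i) = h(i), \quad \forall i \in \mathcal{N}.$$

Similarly, applying min-cut bounds to type 1 subnetworks of 
Figure 1(b), $H(V_\alpha) \ge h(\alpha)$, $\emptyset \neq \alpha \subseteq \mathcal{N}$. 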



We now focus on type 2 subnetworks of Figure 1(c) and 
aim to prove that $H(V_\alpha) \le h(\alpha)$ for any $\emptyset \neq \alpha \subseteq \mathcal{N}$. In 
order for the upper receiver to reconstruct $S_0$ and $S_1$, 

$$H(W_1, W_2) + h(\mathcal{N}) + \sum_{j \in \mathcal{N}} h(j) - 2h(i) \ge H(S_0, S_1)$$

or equivalently, $H(W_1, W_2) \ge 2h(i)$. In addition, 

$$H(W_1, W_2) \le H(U_i, V_i, W_1, W_2) = H(U_i, V_i) \le 2h(i).$$

As a result, $H(W_1, W_2) = H(U_i, V_i, W_1, W_2)$, which further 
implies that $V_i$ is a function of $W_1, W_2$. Thus $V_i$ can be 
recovered at $P_0$. On the other hand, from the lower part of 
the subnetwork, 

$$\begin{aligned}
H(U_i | W_3) &= H(U_i | W_3, U_j, j \neq i) + I(U_i; U_j, j \neq i | W_3) \\
&\overset{(a)}{=} I(U_i; U_j, j \neq i | W_3) \\
&\le I(U_i, W_3; U_j, j \neq i) = 0
\end{aligned}$$

where (a) follows from the fact that $S_0$ can be reconstructed at 
the lower receiver. This implies that $U_i$ can be reconstructed at 
$P_1$. From [8], that $P_0$ can decode $V_i$ and that $P_1$ can decode $U_i$ 
further implies $H(V_i | V_\alpha) = h(\alpha, i) - h(\alpha)$. By mathematical 
induction (similar to the proof of [8, Theorem 1]), the only 
solution that satisfies all of the conditions above is when the 
entropy function of $V_{\mathcal{N}}$ is equal to $h$. 

Finally, from type 1 subnetworks, the support of $V_\alpha$ has size 
at most $2^{h(\alpha)}$. Hence, $V_\alpha$ is indeed quasi-uniform (this also 
implies that the $U_i$ are quasi-uniform, via $H(U_i) = H(V_i) = h(i)$ 
and the independence of the $U_i$). ■ 

Theorem 2 (Converse): For the network $\mathcal{G}'$ and connection 
requirement $M'$, a rate-capacity tuple $T(h)$ is admissible 
if $h$ is quasi-uniform. 

From Theorems 1 and 2, we can follow the approach in [8] 
and easily extend the result to almost entropic functions. 

Theorem 3: For the network $\mathcal{G}'$ and connection requirement 
$M'$, a rate-capacity tuple $T(h)$ is achievable if and only 
if $h$ is almost entropic.$^1$ 

IV. Secure Multicast 

Linear network codes (for single source multicast) that are 
resilient to eavesdropping are considered in [11]. Sufficient 
conditions for the existence of such codes were also derived. 
This was further generalized in [12] to multi-source cases. A 
similar result was also obtained in [13] which gives necessary 
and sufficient conditions under which transmitted data are 
safe from being revealed to eavesdroppers. All of the above- 
cited works assume that the wiretapper aims to reconstruct 
all sources. Similar results have been obtained where only a 
subset of sources are to be reconstructed [14]. Inner and outer 
bounds to the secure capacity region were given in [10]. 

We will now show that even for a simple single-session 
secure multicast problem, determination of the capacity region 
can be extremely hard. In particular, the problem is at least as 
hard as any multi-source multi-session multicast problem. 

Figure 2 shows the construction for a network $\mathcal{G}^*$. The 
source message is $X$ whose rate is $d$. The link capacities are 
parametrized by $0 < c < d$. There is a single eavesdropper 
who only observes the message variable $W_3$. Thus Figure 2 
also specifies $M^*$, and the wiretapping pattern $\mathcal{A}^*, \mathcal{B}^*$. 




Fig. 2. The network $\mathcal{G}^*$. 

Proposition 1: Given network $\mathcal{G}^*$ and connection (and secrecy) 
requirement $M^*$ depicted in Figure 2, if a rate-capacity 
tuple $T(h)$ is admissible then $K$ is a function of $W_4$. 

Proof: From the capacity constraint on $\mathcal{G}^*$, we have 

$$\begin{aligned}
H(W_1, W_2) &\le H(W_1) + H(W_2) \\
&\le c + d - c \\
&= H(X).
\end{aligned}$$

Together with the decodability requirement, $H(X | W_1, W_2) = 0$, we have 

$$\begin{aligned}
H(W_1, W_2) &= H(W_1) + H(W_2) \\
H(W_1, W_2 | X) &= 0 \\
H(W_1, W_2) &= H(X) \\
H(W_1) &= c \\
H(W_2) &= d - c.
\end{aligned}$$
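
The equalities in this first step of the proof come from a sandwich argument. As a sketch, assuming $H(X) = d$ (the source rate) and the capacity bounds $H(W_1) \le c$ and $H(W_2) \le d - c$:

```latex
\begin{align*}
H(X) &\le H(X, W_1, W_2) \\
     &= H(W_1, W_2) + H(X \mid W_1, W_2) \\
     &= H(W_1, W_2) && \text{(decodability)} \\
     &\le H(W_1) + H(W_2) \\
     &\le c + (d - c) = H(X).
\end{align*}
```

Every inequality in the chain must therefore hold with equality. This gives $H(W_1, W_2) = H(X) = H(W_1) + H(W_2)$, $H(W_1) = c$ and $H(W_2) = d - c$, and also $H(W_1, W_2 \mid X) = H(X, W_1, W_2) - H(X) = 0$.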



$^1$ A function $h$ is almost entropic if it is the limit of a sequence of entropic 
functions. 



Applying a min-cut bound on the set of edge variables 
$\{W_2, W_5\}$, we can also prove that $H(W_5) = c$ and 
$H(W_5 | X) = 0$. On the other hand, the secrecy constraint 
requires $I(W_3; X) = 0$ and hence 

$$I(W_1; W_3) = 0 \qquad (1)$$

as $W_1$ is a function of $X$. 

Now, we will show that $H(W_1 | W_3, W_4) = 0$. First, 

$$\begin{aligned}
I(W_3, W_4; X) &\overset{(a)}{=} I(W_3, W_4; W_1, X) \\
&= I(W_3, W_4; W_1) + I(W_3, W_4; X | W_1) \\
&\overset{(b)}{=} I(W_3, W_4; W_1)
\end{aligned}$$

where (a) follows from the fact that $W_1$ is a function of $X$ 
and (b) follows from the conditional independence implied by 
the underlying network topology. Using the same argument, 
we can also prove that $I(K, W_3; X) = I(K, W_3; W_1)$. 

Since $W_5$ is a function of $X$ and is thus independent of 
internal randomness, Lemma 1 implies that $H(W_5 | W_3, W_4) = 0$. 
Together with $H(W_5) = c$, we have 

$$\begin{aligned}
I(W_3, W_4; W_1) &= I(W_3, W_4; X) \\
&\ge I(W_3, W_4; W_5) \\
&= H(W_5) = c.
\end{aligned}$$

Since $H(W_1) = c$, it implies that $H(W_1 | W_3, W_4) = 0$, 
or equivalently that $W_1$ is a function of $W_3$ and $W_4$. Similarly, 
using the same argument, we can also prove that 
$H(W_1 | K, W_3) = 0$. 

Our final aim is to show that $H(K) = H(W_4) = c$ 
and $H(K, W_4) = c$. Clearly, both $H(K)$ and $H(W_4)$ are 
bounded above by $c$ due to the edge capacity constraint. We 
obtain a lower bound on the entropy of $K$ as follows. 

$$\begin{aligned}
H(K) &\ge I(K; W_1 | W_3) \\
&= I(K; W_1 | W_3) + H(W_1 | K, W_3) \\
&= H(W_1 | W_3) \\
&\overset{(a)}{=} H(W_1) = c
\end{aligned}$$

where (a) follows from (1). Hence, $H(K) = c$. Similarly, 
we can also prove that $H(W_4) = c$. 
Independence of $W_1$ and $W_3$ implies 

$$\begin{aligned}
H(K | W_1, W_3) &= H(W_1, K, W_3) - H(W_1, W_3) \\
&= H(K, W_3) - H(W_1, W_3) \\
&= H(K, W_3) - H(W_1) - H(W_3) \\
&\overset{(a)}{=} H(W_3 | K) - H(W_3) \le 0,
\end{aligned}$$

where (a) follows from $H(W_1) = H(K) = c$. Consequently, 
$H(K | W_1, W_3) = 0$. 

Similarly, $H(W_4 | W_1, W_3) = 0$. Finally, 

$$\begin{aligned}
2c &\ge H(W_1, W_3) \\
&= H(W_1, K, W_3, W_4) \\
&\ge H(W_1, K, W_4) \\
&\overset{(a)}{=} H(W_1) + H(K, W_4) \ge 2c
\end{aligned}$$

where (a) follows from independence of $W_1$ and $(K, W_4)$. 
Hence, $H(K, W_4) = c$, which further implies $H(K | W_4) = H(W_4 | K) = 0$. ■ 

Under a regularity condition (that $2^c$ and $2^d$ are integers), 
the converse of Proposition 1 also holds. 

Proposition 2 (Converse): For the network $\mathcal{G}^*$ with connection 
(and secrecy) requirement $M^*$, the specified rate-capacity 
tuple is admissible if a secret key of rate $c$ can be transmitted 
from the node $P_0$ to $P_1$. 

Essentially, Propositions 1 and 2 suggest that the admissibility 
of the single source secure multicast problem depends 
on communication of a secret key from $P_0$ to $P_1$. Joining 
several copies of $\mathcal{G}^*$ together (see Figure 3), we can easily 
generalize the network such that admissibility implies that 
multiple secret keys must be transmitted across a network. 
This turns the single source secure multicast problem into a 
multi-source multicast. 

Theorem 4: For any multicast problem (without secrecy 
constraints), there exists a corresponding secure multicast 
problem such that the multicast problem is admissible if and 
only if the corresponding secure multicast problem is also 
admissible. Consequently, using the single-source two sessions 
network Q> and a connection requirement M\ there exists a 
secure multicast problem such that a rate capacity tuple T(h) 
is achievable if and only if h is almost entropic. 




Fig. 3. Several copies of $\mathcal{G}^*$. 

V. Implications and conclusion 

Theorems 3 and 4 show that even for a single-source 
network multicast problem with two independent sets of 
messages or for a single source secure multicast problem, 
the determination of the set of achievable rate-capacity tuples 
can be extremely hard. Following the same arguments as used 
in [8], we can also prove the following results for a single- 
source two-session multicast problem or for a single-source 
single-session multicast problem with secrecy constraints: 

1) Capacity regions are not polyhedral$^2$ in general. 

2) LP bounds are not tight in general. 

3) Linear codes are not sufficient to achieve capacity. 

$^2$ That the single-source single-session secure multicast problem has a 
non-polyhedral capacity region is somewhat surprising, since the region for the 
same problem without the secrecy constraint is completely determined by the 
min-cut bound. 

In other words, finding capacity regions for (secure) multicast 
problems seems to be a mission impossible. Not only are the 
existing bounding techniques loose, the non-polyhedral nature 
of the capacity region suggests that LP bounds cannot fully 
characterize the region, even with the addition of more and 
more newly discovered information inequalities. Any finite set 
of such new inequalities can only further tighten the bound, 
but can never yield the exact capacity region. 
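
To illustrate what a bound built from finitely many linear inequalities tests, the sketch below checks whether a candidate function satisfies the Shannon-type (polymatroid) inequalities. We assume here that the LP bounds mentioned above are of this kind; the helper names and the two test functions are hypothetical. Passing the check is necessary for a function to be almost entropic, but it is not sufficient once four or more variables are involved, because non-Shannon inequalities exist.

```python
from itertools import combinations

def subsets(n):
    """All subsets of {0, ..., n-1} as frozensets, including the empty set."""
    return [frozenset(c) for k in range(n + 1)
            for c in combinations(range(n), k)]

def is_polymatroid(h, n, tol=1e-9):
    """Check h(empty) = 0, monotonicity and submodularity of h."""
    all_sets = subsets(n)
    if abs(h[frozenset()]) > tol:
        return False
    for a in all_sets:
        for b in all_sets:
            # Monotonicity: a subset never has larger value than a superset.
            if a <= b and h[a] > h[b] + tol:
                return False
            # Submodularity: h(a) + h(b) >= h(a | b) + h(a & b).
            if h[a] + h[b] + tol < h[a | b] + h[a & b]:
                return False
    return True

# Entropy function of two uniform bits and their XOR: h(a) = min(|a|, 2).
h_rank = {a: min(len(a), 2) for a in subsets(3)}

# A function that violates submodularity: h(a) = |a| squared.
h_bad = {a: len(a) ** 2 for a in subsets(3)}

shannon_ok = is_polymatroid(h_rank, 3)
shannon_violated = not is_polymatroid(h_bad, 3)
```

Each check is one linear inequality in the coordinates of h, so the set of functions that pass is a polyhedral cone. A non-polyhedral region cannot be described exactly by any finite list of such checks, which is the point made in the preceding paragraph.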

Despite the hardness of the problem, there are still many 
questions to be answered. It is unclear what makes the problem of 
finding the capacity region so difficult. In the case of a 
single session multicast or the case where there are only 
two sinks, capacity regions have explicit polyhedral charac- 
terizations provided by min-cut bounds. On the other hand, 
where there are many sinks, the capacity region can be 
extremely complicated to characterize, even if there are only 
two independent sessions. It will be of great importance to 
classify the set of networks and connection requirements that 
lead to polyhedral capacity regions characterized by min-cut 
bounds or LP bounds. 